Blue: A Unified Programming Model for Diverse Data-intensive Cloud Computing Paradigms

Authors

  • Maneesh Varshney
  • Vishwa Goudar
Abstract

Several computational paradigms exist today for processing large volumes of data on a cluster of resources: batch processing, iterative, interactive, memory-based, data flow oriented, relational, structured, etc. A unified system or framework that supports all of these paradigms would be greatly beneficial, as it would allow different algorithms to run on the same dataset, reduce system deployment, maintenance and training costs, provide a common platform for research, and enable advances to be incorporated quickly. In this paper, we propose an approach towards such a system by introducing a unified programming model called Blue. The model supports diverse paradigms of data-intensive computation, under the assumption that a program can be decomposed into a dependency graph of component tasks. In particular, Blue is capable of modeling iterative problems through a novel approach of unfolding a cyclic graph into an unbounded acyclic graph. We illustrate the Blue model for several paradigms, such as Map-Reduce, joining datasets, Pregel, iterative algorithms such as k-Means, and interactive querying. The model can be implemented efficiently by analyzing the graph and scheduling resources in a manner that benefits from data and network locality. Additionally, Blue supports in-memory caching of data, either explicitly by the programmer or opportunistically by the system, which has been shown to greatly improve the latency and throughput of interactive and iterative programs [1]. Finally, the Blue model provides simple and consistent semantics for fault tolerance of acyclic as well as cyclic dependency graphs.
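The idea of unfolding a cyclic dependency graph into an acyclic one can be illustrated with a minimal sketch. This is not the Blue API, which the abstract does not specify. The function names `unfold` and `topological_order`, the split into forward and back edges, and the fixed iteration count are all assumptions made for illustration. The paper describes an unbounded unfolding, whereas this sketch stops after a given number of iterations.

```python
from collections import defaultdict


def unfold(edges, back_edges, iterations):
    """Unfold a cyclic task graph into an acyclic one (hypothetical helper).

    edges:      (u, v) dependencies that stay within one iteration
    back_edges: (u, v) dependencies that close a cycle; task u in
                iteration i feeds task v in iteration i + 1
    Returns a list of edges over (task, iteration) nodes.
    """
    unfolded = []
    for i in range(iterations):
        for u, v in edges:
            unfolded.append(((u, i), (v, i)))
        if i + 1 < iterations:
            for u, v in back_edges:
                unfolded.append(((u, i), (v, i + 1)))
    return unfolded


def topological_order(graph_edges):
    """Kahn's algorithm; raises ValueError if the graph has a cycle."""
    indegree = defaultdict(int)
    successors = defaultdict(list)
    nodes = set()
    for u, v in graph_edges:
        successors[u].append(v)
        indegree[v] += 1
        nodes.update((u, v))
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("graph still contains a cycle")
    return order


# A k-Means-like loop: assign points, update centroids, repeat.
edges = [("assign", "update")]
back_edges = [("update", "assign")]

dag = unfold(edges, back_edges, 3)
order = topological_order(dag)
```

Each task instance becomes a `(task, iteration)` node, so the back edge from `update` to `assign` points forward into the next iteration and the result can be scheduled like any other acyclic graph.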

Similar resources

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations because the rapid development in the use of information technology in general and network technology in particular, has led to the trend of many organizations to make their applications available for use via electronic platforms hos...

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

Integrated modeling and solving the resource allocation problem and task scheduling in the cloud computing environment

Cloud computing is considered to be a new service provider technology for users and businesses. However, the cloud environment is facing a number of challenges. Resource allocation in a way that is optimum for users and cloud providers is difficult because of lack of data sharing between them. On the other hand, job scheduling is a basic issue and at the same time a big challenge in reaching hi...

A Model based on Cloud Computing for the implementation and management IT services in Banks

In recent years, the banking industry has made significant changes in technology and communications. The expansion of electronic communications and a large number of people around the world access to the Internet, appropriate to establish trade and economic exchanges provided but high costs, lack of flexibility and agility in existing systems because of the large volume of information, confiden...

Publication date: 2013